The Impact of Noise in Web Genre Identification
Abstract
Genre detection of web documents can be framed as an open-set classification task. Web documents that do not belong to any predefined genre, or in which multiple genres co-exist, are considered noise. In this work we study the impact of noise on automated genre identification within an open-set classification framework. We examine alternative classification models and document representation schemes on two corpora, one without noise and one with noise, and show that the recently proposed RFSE model remains robust in the presence of noise. Moreover, we show that the identification of certain genres is practically unaffected by the presence of noise.
Publication date: 2015